Faster Regular Expression Matching

Authors

  • Philip Bille
  • Mikkel Thorup
Abstract

Regular expression matching is a key task (and often the computational bottleneck) in a variety of widely used software tools and applications, for instance, the Unix grep and sed commands, scripting languages such as awk and perl, and programs for analyzing massive data streams. We show how to solve this ubiquitous task in linear space and O(nm (log log n)/(log n)^(3/2) + n + m) time, where m is the length of the expression and n is the length of the string. This is the first improvement of the dominant O(nm/log n) term in Myers' O(nm/log n + (n + m) log n) bound [JACM 1992]. We also obtain improved bounds for external memory.
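
For orientation, the following is a minimal sketch of the classical O(nm) baseline that bounds of this kind improve on: build a Thompson-style NFA with O(m) states from the expression, then scan the string while maintaining the set of active states. It is not the algorithm of the paper. The names NFA, compile_regex, closure, and matches are hypothetical, and the supported syntax (literals, concatenation, '|', '*', parentheses) is an assumption made for illustration.

```python
class NFA:
    """Thompson-style NFA: each state has epsilon edges and at most one symbol edge."""

    def __init__(self):
        self.eps = []  # eps[s] = list of states reachable from s by an epsilon edge
        self.sym = []  # sym[s] = (character, target state) or None

    def new_state(self):
        self.eps.append([])
        self.sym.append(None)
        return len(self.eps) - 1


def compile_regex(pattern):
    """Build an NFA with O(m) states; returns (nfa, start, accept)."""
    nfa = NFA()
    pos = 0

    def parse_alt():
        nonlocal pos
        start, end = parse_concat()
        while pos < len(pattern) and pattern[pos] == '|':
            pos += 1
            s2, e2 = parse_concat()
            s, e = nfa.new_state(), nfa.new_state()
            nfa.eps[s] += [start, s2]      # branch into either alternative
            nfa.eps[end].append(e)
            nfa.eps[e2].append(e)
            start, end = s, e
        return start, end

    def parse_concat():
        start = end = nfa.new_state()      # empty fragment to chain onto
        while pos < len(pattern) and pattern[pos] not in '|)':
            s2, e2 = parse_star()
            nfa.eps[end].append(s2)
            end = e2
        return start, end

    def parse_star():
        nonlocal pos
        start, end = parse_atom()
        while pos < len(pattern) and pattern[pos] == '*':
            pos += 1
            s, e = nfa.new_state(), nfa.new_state()
            nfa.eps[s] += [start, e]       # enter the loop or skip it
            nfa.eps[end] += [start, e]     # repeat the loop or leave it
            start, end = s, e
        return start, end

    def parse_atom():
        nonlocal pos
        if pattern[pos] == '(':
            pos += 1
            start, end = parse_alt()
            if pos >= len(pattern) or pattern[pos] != ')':
                raise ValueError("missing ')'")
            pos += 1
            return start, end
        s, e = nfa.new_state(), nfa.new_state()
        nfa.sym[s] = (pattern[pos], e)     # single-character transition
        pos += 1
        return s, e

    start, accept = parse_alt()
    if pos != len(pattern):
        raise ValueError("unbalanced ')'")
    return nfa, start, accept


def closure(nfa, states):
    """All states reachable from `states` through epsilon edges only."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in nfa.eps[s]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def matches(pattern, text):
    """True if the whole text is in the language; O(m) work per character."""
    nfa, start, accept = compile_regex(pattern)
    current = closure(nfa, {start})
    for ch in text:
        moved = {nfa.sym[s][1] for s in current
                 if nfa.sym[s] is not None and nfa.sym[s][0] == ch}
        current = closure(nfa, moved)
    return accept in current
```

Each of the n text characters triggers one pass over at most O(m) states and edges, which is where the nm product comes from. Word-level parallel methods such as those discussed in the abstract aim to process many of these states per machine operation.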

Similar resources

Fast Motif Search in Protein Sequence Databases

Regular expression pattern matching is widely used in computational biology. Searching a database of sequences for a motif (a simple regular expression) or its variations is an important interactive process that requires fast motif-matching algorithms. In this paper, we explore and evaluate various representations of the database of sequences using suffix trees for two types of query ...

A Regular Expression Matching Approach to Distributed Wireless Network Security System

There is a growing demand for wireless ad hoc network systems that examine the content of data packets in order to improve network security and application service. However, each distributed wireless node has limited memory and computing power. Since regular expressions offer superior expression power and flexibility, taking advantage of distributed nodes and regular expression collaboratively...

Survey of Global Regular Expression Print (GREP) Tools

The UNIX grep utility marked the birth of global regular expression print (GREP) tools. Searching for patterns in text is an important operation in a number of domains, including program comprehension and software maintenance, structured text databases, indexing file systems, and searching natural language texts. Such a wide range of uses inspired the development of variations of the original UN...

Approximate Regular Expression Searching with Arbitrary Integer Weights

We present a bit-parallel technique to search a text of length n for a regular expression of m symbols permitting k differences in worst-case time O(mn/log_k s), where s is the amount of main memory that can be allocated. The algorithm permits arbitrary integer weights and matches the complexity of the best previous techniques, but it is simpler and faster in practice. In our way, we define a n...

An Innovative Approach for Regular Expression Matching Based on NoC Architecture

A new regular expression (regex) matching method based on Network-on-Chip (NoC) architecture is proposed in this paper. The idea is to combine a new kind of regex matching engine implemented in hardware with NoC architecture to achieve a high matching rate. Regex matching is performed by partitioning the regex into several parts to make the finite state machine (FSM) simpler. Each part of rege...

Publication date: 2009